Domain Adaptation for Dependency Parsing via Self-Training
Authors
Abstract
This paper presents a successful approach to domain adaptation of a dependency parser via self-training. We improve parsing accuracy on out-of-domain texts with a self-training approach that uses confidence-based methods to select additional training samples. We compare two confidence-based methods: the first uses the parse score of the employed parser to measure confidence in a parse tree; the second calculates the score difference between the best tree and alternative trees. With these methods, we improve the labeled accuracy score by 1.6 percentage points on texts from a chemical domain and by 0.6 points on average on texts from three web domains. Our improvement of 1.5% UAS on the chemical texts is substantially higher than the 0.5% UAS improvement reported in previous work. For the three web domains, no positive results for self-training have been reported before.
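The following is a minimal sketch of how such confidence-based selection of additional training samples could look; it is an illustration under stated assumptions, not the authors' implementation. The `parser.parse_nbest` interface, the threshold parameters, and all other names are hypothetical.

```python
# A minimal sketch of confidence-based sample selection for self-training.
# Assumption: a hypothetical `parser` object that can return its n best
# scored candidate trees for a sentence; the thresholds are illustrative.

def select_confident_parses(sentences, parser, n_best=2,
                            score_threshold=0.0, margin_threshold=0.0):
    """Return auto-parsed sentences whose parses look reliable enough
    to be added to the parser's training data."""
    selected = []
    for sentence in sentences:
        # Ask the parser for its n best trees together with their scores.
        candidates = parser.parse_nbest(sentence, n=n_best)
        best_tree, best_score = candidates[0]

        # Method 1: treat the parser's own score of the best tree as a
        # confidence value.
        confident_by_score = best_score >= score_threshold

        # Method 2: use the score margin between the best tree and the
        # runner-up; a large gap suggests the parser is not torn between
        # competing analyses.
        if len(candidates) > 1:
            margin = best_score - candidates[1][1]
        else:
            margin = float("inf")
        confident_by_margin = margin >= margin_threshold

        if confident_by_score and confident_by_margin:
            selected.append((sentence, best_tree))
    return selected
```

In a self-training loop of this kind, the selected sentences with their automatically produced trees would be added to the original gold-annotated training data and the parser retrained on the union.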
Similar Papers
Adversarial Training for Cross-Domain Universal Dependency Parsing
We describe our submission to the CoNLL 2017 shared task, which exploits the shared common knowledge of a language across different domains via a domain adaptation technique. Our approach is an extension to the recently proposed adversarial training technique for domain adaptation, which we apply on top of a graph-based neural dependency parsing model on bidirectional LSTMs. In our experiments,...
Exploring Self-training and Co-training for Dependency Parsing
We explore the effect of self-training and co-training on Hindi dependency parsing. We use Malt parser, which is a state-of-the-art Hindi dependency parser, and apply self-training using a large unannotated corpus. For co-training, we use MST parser with comparable accuracy to the Malt parser. Experiments are performed using two types of raw corpora: one from the same domain as the test data and...
Dependency Parsing Domain Adaptation using Transductive SVM
Dependency Parsing domain adaptation involves adapting a dependency parser, trained on an annotated corpus from a given domain (e.g., newspaper articles), to work on a different target domain (e.g., legal documents), given only an unannotated corpus from the target domain. We present a shift/reduce dependency parser that can handle unlabeled sentences in its training set using a transductive SV...
A Pointwise Approach to Training Dependency Parsers from Partially Annotated Corpora
We introduce a word-based dependency parser for Japanese that can be trained from partially annotated corpora, allowing for effective use of available linguistic resources and reduction of the costs of preparing new training data. This is especially important for domain adaptation in a real-world situation. We use a pointwise approach where each edge in the dependency tree for a sentence is est...
Treeblazing: Using External Treebanks to Filter Parse Forests for Parse Selection and Treebanking
We describe “treeblazing”, a method of using annotations from the GENIA treebank to constrain a parse forest from an HPSG parser. Combining this with self-training, we show significant dependency score improvements in a task of adaptation to the biomedical domain, reducing error rate by 9% compared to out-of-domain gold data and 6% compared to self-training. We also demonstrate improvements in ...